new york elite
Playing Large Games with Oracles and AI Debate
Chen, Xinyi, Chen, Angelica, Foster, Dean, Hazan, Elad
We consider regret minimization in repeated games with a very large number of actions. Such games are inherent in the setting of AI safety via debate, and more generally games whose actions are language-based. Existing algorithms for online game playing require computation polynomial in the number of actions, which can be prohibitive for large games. We thus consider oracle-based algorithms, as oracles naturally model access to AI agents. With oracle access, we characterize when internal and external regret can be minimized efficiently. We give a novel efficient algorithm for internal regret minimization whose regret and computation complexity depend logarithmically on the number of actions. This implies efficient oracle-based computation of a correlated equilibrium in large games. We conclude with experiments in the setting of AI Safety via Debate that shows the benefit of insights from our algorithmic analysis.
- North America > United States > New York > New York County > New York City (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > Finland > Uusimaa > Helsinki (0.04)
- Europe > Ireland > Leinster > County Dublin > Dublin (0.04)